SEQUENCE LISTING

(1) GENERAL INFORMATION: 
(i) APPLICANT: 






(ii) TITLE OF INVENTION: 

(iii) NUMBER OF SEQUENCES: 19 



(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Foley, Hoag & Eliot LLP 

(B) STREET: One Post Office Square 

(C) CITY: Boston 
(D) STATE: MA

(E) COUNTRY: US

(F) ZIP: 02109

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Vincent, Matthew P.

(B) REGISTRATION NUMBER: 36,709

(C) REFERENCE/DOCKET NUMBER: SUV003.04

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 617-832-1000

(B) TELEFAX: 617-832-7000

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 736 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

AACNNCNNTN NATGGCACCC CCNCCCAACC TTTNNNCCNN NTAANCAAAA NNCCCCNTTT 60 

NATACCCCCT NTAANANTTT TCCACCNNNC NNAAANNCCN CTGNANACNA NGNAAANCCN 120 

TTTTTNAACC CCCCCCACCC GGAATTCCNA NTNNCCNCCC CCAAATTACA ACTCCAGNCC 180

AAAATTNANA NAATTGGTCC TAACCTAACC NATNGTTGTT ACGGTTTCCC CCCCCAAATA 240

CATGCACTGG CCCGAACACT TGATCGTTGC CGTTCCAATA AGAATAAATC TGGTCATATT 300

AAACAAGCCN AAAGCTTTAC AAACTGTTGT ACAATTAATG GGCGAACACG AACTGTTCGA 360

ATTCTGGTCT GGACATTACA AAGTGCACCA CATCGGATGG AACCAGGAGA AGGCCACAAC 420

CGTACTGAAC GCCTGGCAGA AGAAGTTCGC ACAGGTTGGT GGTTGGCGCA AGGAGTAGAG 480

TGAATGGTGG TAATTTTTGG TTGTTCCAGG AGGTGGATCG TCTGACGAAG AGCAAGAAGT 540

CGTCGAATTA CATCTTCGTG ACGTTCTCCA CCGCCAATTT GAACAAGATG TTGAAGGAGG 600

CGTCGAANAC GGACGTGGTG AAGCTGGGGG TGGTGCTGGG GGTGGCGGCG GTGTACGGGT 660

GGGTGGCCCA GTCGGGGCTG GCTGCCTTGG GAGTGCTGGT CTTNGCGNGC TNCNATTCGC 720

CCTATAGTNA GNCGTA 736

(2) INFORMATION FOR SEQ ID NO : 2 : 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 : 

Xaa Pro Pro Pro Asn Tyr Asn Ser Xaa Pro Lys Xaa Xaa Xaa Leu Val
1 5 10 15

Leu Thr Pro Xaa Val Val Thr Val Ser Pro Pro Lys Tyr Met His Trp
20 25 30

Pro Glu His Leu Ile Val Ala Val Pro Ile Arg Ile Asn Leu Val Ile
35 40 45

Leu Asn Lys Pro Lys Ala Leu Gln Thr Val Val Gln Leu Met Gly Glu
50 55 60

His Glu Leu Phe Glu Phe Trp Ser Gly His Tyr Lys Val His His Ile
65 70 75 80

Gly Trp Asn Gln Glu Lys Ala Thr Thr Val Leu Asn Ala Trp Gln Lys
85 90 95

Lys Phe Ala Gln Val Gly Gly Trp Arg Lys Glu
100 105

(2) INFORMATION FOR SEQ ID NO : 3 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5187 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GGGTCTGTCA CCCGGAGCCG GAGTCCCCGG CGGCCAGCAG CGTCCTCGCG AGCCGAGCGC 60

CCAGGCGCGC CCGGAGCCCG CGGCGGCGGC GGCAACATGG CCTCGGCTGG TAACGCCGCC 120

GGGGCCCTGG GCAGGCAGGC CGGCGGCGGG AGGCGCAGAC GGACCGGGGG ACCGCACCGC 180

GCCGCGCCGG ACCGGGACTA TCTGCACCGG CCCAGCTACT GCGACGCCGC CTTCGCTCTG 240

GAGCAGATTT CCAAGGGGAA GGCTACTGGC CGGAAAGCGC CGCTGTGGCT GAGAGCGAAG 300

TTTCAGAGAC TCTTATTTAA ACTGGGTTGT TACATTCAAA AGAACTGCGG CAAGTTTTTG 360

GTTGTGGGTC TCCTCATATT TGGGGCCTTC GCTGTGGGAT TAAAGGCAGC TAATCTCGAG 420

ACCAACGTGG AGGAGCTGTG GGTGGAAGTT GGTGGACGAG TGAGTCGAGA ATTAAATTAT 480

ACCCGTCAGA AGATAGGAGA AGAGGCTATG TTTAATCCTC AACTCATGAT ACAGACTCCA 540

AAAGAAGAAG GCGCTAATGT TCTGACCACA GAGGCTCTCC TGCAACACCT GGACTCAGCA 600

CTCCAGGCCA GTCGTGTGCA CGTCTACATG TATAACAGGC AATGGAAGTT GGAACATTTG 660

TGCTACAAAT CAGGGGAACT TATCACGGAG ACAGGTTACA TGGATCAGAT AATAGAATAC 720

CTTTACCCTT GCTTAATCAT TACACCTTTG GACTGCTTCT GGGAAGGGGC AAAGCTACAG 780

[illegible] GATACCTCCT AGGTAAGCCT CCTTTACGGT GGACAAACTT TGACCCCTTG 840

GAATTPCTAG AAGAGTTAAA GAAAATAAAC TACCAAGTGG ACAGCTGGGA GGAAATGCTG 900

AATAAAGCCG AAGTTGGCCA TGGGTACATG GACCGGCCTT GCCTCAACCC AGCCGACCCA 960

[missing] CCACAGCCCC TAACAAAAAT TCAACCAAAC CTCTTGATGT GGCCCTTGTT 1020

TTGAATGGTG GATGTCAAGG TTTATCCAGG AAGTATATGC ATTGGCAGGA GGAGTTGATT 1080

GTGGGTGGTA CCGTCAAGAA TGCCACTGGA AAACTTGTCA GCGCTCACGC CCTGCAAACC 1140

ATGTTCCAGT TAATGACTCC CAAGCAAATG TATGAACACT TCAGGGGCTA [missing] 1200

TCTCACATCA ACTGGAATGA AGACAGGGCA GCCGCCATCC TGGAGGCCTG GCAGAGGACT 1260

TACGTGGAGG TGGTTCATCA AAGTGTCGCC CCAAACTCCA CTCAAAAGGT GCTTCCCTTC 1320

ACAACCACGA CCCTGGACGA CATCCTAAAA TCCTTCTCTG ATGTCAGTGT CATCCGAGTG 1380

GCCAGCGGCT ACCTACTGAT GCTTGCCTAT GCCTGTTTAA CCATGCTGCG CTGGGACTGC 1440

TCCAAGTCCC AGGGTGCCGT GGGGCTGGCT GGCGTCCTGT TGGTTGCGCT GTCAGTGGCT 1500

GCAGGATTGG GCCTCTGCTC CTTGATTGGC ATTTCTTTTA ATGCTGCGAC AACTCAGGTT 1560

TTGCCGTTTC TTGCTCTTGG TGTTGGTGTG GATGATGTCT TCCTCCTGGC CCATGCATTC 1620

AGTGAAACAG GACAGAATAA GAGGATTCCA TTTGAGGACA GGACTGGGGA GTGCCTCAAG 1680

CGCACCGGAG CCAGCGTGGC CCTCACCTCC ATCAGCAATG TCACCGCCTT CTTCATGGCC 1740

GCATTGATCC CTATCCCTGC CCTGCGAGCG TTCTCCCTCC AGGCTGCTGT GGTGGTGGTA 1800

TTCAATTTTG CTATGGTTCT GCTCATTTTT CCTGCAATTC TCAGCATGGA TTTATACAGA 1860

CGTGAGGACA GAAGATTGGA TATTTTCTGC TGTTTCACAA GCCCCTGTGT CAGCAGGGTG 1920

ATTCAAGTTG AGCCACAGGC CTACACAGAG CCTCACAGTA ACACCCGGTA CAGCCCCCCA 1980

CCCCCATACA CCAGCCACAG CTTCGCCCAC GAAACCCATA TCAGTATGCA GTCCACCGTT 2040

CAGCTCCGCA CAGAGTATGA CCCTCACACG CACGTGTAfcT ACACCACCGC CGAGCCACGC 2100

TCTGAGATCT CTGTACAGCC TGTTACCGTC ACCCAGGACA ACCTCAGCTG TCAGAGTCCC 2160

GAGAGCACCA GCTCTACCAG GGACCTGCTC TCCCAGTTCT CAGACTCCAG CCTCCACTGC 2220

CTCGAGCCCC CCTGCACCAA GTGGACACTC TCTTCGTTTG CAGAGAAGCA CTATGCTCCT 2280

TTCCTCCTGA AACCCAAAGC CAAGGTTGTG GTAATCCTTC TTTTCCTGGG CTTGCTGGGG 2340

GTCAGCCTTT ATGGGACCAC CCGAGTGAGA GACGGGCTGG ACCTCACGGA CATTGTTCCC 2400

CGGGAAACCA GAGAATATGA CTTCATAGCT GCCCAGTTCA AGTACTTCTC TTTCTACAAC 2460

ATGTATATAG TCACCCAGAA AGCAGACTAC CCGAATATCC AGCACCTACT TTACGACCTT 2520

CATAAGAGTT TCAGCAATGT GAAGTATGTC ATGCTGGAGG AGAACAAGCA ACTTCCCCAA 2580

ATGTGGCTGC ACTACTTTAG AGACTGGCTT CAAGGACTTC AGGATGCATT TGACAGTGAC 2640

TGGGAAACTG GGAGGATCAT GCCAAACAAT TATAAAAATG GATCAGATGA CGGGGTCCTC 2700

GCTTACAAAC TCCTGGTGCA GACTGGCAGC CGAGACAAGC CCATCGACAT TAGTCAGTTG 2760

ACTAAACAGC GTCTGGTAGA CGCAGATGGC ATCATTAATC CGAGCGCTTT CTACATCTAC 2820

CTGACCGCTT GGGTCAGCAA CGACCCTGTA GCTTACGCTG CCTCCCAGGC CAACATCCGG 2880

CCTCACCGGC CGGAGTGGGT CCATGACAAA GCCGACTACA TGCCAGAGAC CAGGCTGAGA 2940

ATCCCAGCAG CAGAGCCCAT CGAGTACGCT CAGTTCCCTT TCTACCTCAA CGGCCTACGA 3000

GACACCTCAG ACTTTGTGGA AGCCATAGAA AAAGTGAGAG TCATCTGTAA CAACTATACG 3060

AGCCTGGGAC TGTCCAGCTA CCCCAATGGC TACCCCTTCC TGTTCTGGGA GCAATACATC 3120

AGCCTGCGCC ACTGGCTGCT GCTATCCATC AGCGTGGTGC TGGCCTGCAC GTTTCTAGTG 3180

TGCGCAGTCT TCCTCCTGAA CCCCTGGACG GCCGGGATCA TTGTCATGGT CCTGGCTCTG 3240

ATGACCGTTG AGCTCTTTGG CATGATGGGC CTCATTGGGA TCAAGCTGAG TGCTGTGCCT 3300

GTGGTCATCC TGATTGCATC TGTTGGCATC GGAGTGGAGT TCACCGTCCA CGTGGCTTTG 3360

GCCTTTCTGA CAGCCATTGG GGACAAGAAC CACAGGGCTA TGCTCGCTCT GGAACACATG 3420

TTTGCTCCCG TTCTGGACGG TGCTGTGTCC ACTCTGCTGG GTGTACTGAT GCTTGCAGGG 3480

TCCGAATTTG ATTTCATTGT CAGATACTTC TTTGCCGTCC TGGCCATTCT CACCGTCTTG 3540

GGGGTTCTCA ATGGACTGGT TCTGCTGCCT GTCCTCTTAT CCTTCTTTGG ACCGTGTCCT 3600

GAGGTGTCTC CAGCCAATGG CCTAAACCGA CTGCCCACTC CTTCGCCTGA GCCGCCTCCA 3660 

AGTGTCGTCC GGTTTGCCGT GCCTCCTGGT CACACGAACA ATGGGTCTGA TTCCTCCGAC 3720 

TCGGAGTACA GCTCTCAGAC CACGGTGTCT GGCATCAGTG AGGAGCTCAG GCAATACGAA 3780

GCACAGCAGG GTGCCGGAGG CCCTGCCCAC CAAGTGATTG TGGAAGCCAC AGAAAACCCT 3840 

GTCTTTGCCC GGTCCACTGT GGTCCATCCG GACTCCAGAC ATCAGCCTCC CTTGACCCCT 3900 

CGGCAACAGC CCCACCTGGA CTCTGGCTCC TTGTCCCCTG GACGGCAAGG CCAGCAGCCT 3960 

CGAAGGGATC CCCCTAGAGA AGGCTTGCGG CCACCCCCCT ACAGACCGCG CAGAGACGCT 4020 

TTTGAAATTT CTACTGAAGG GCATTCTGGC CCTAGCAATA GGGACCGCTC AGGGCCCCGT 4080 

GGGGCCCGTT CTCACAACCC TCGGAACCCA ACGTCCACCG CCATGGGCAG CTCTGTGCCC 4140 

AGCTACTGCC AGCCCATCAC CACTGTGACG GCTTCTGCTT CGGTGACTGT TGCTGTGCAT 4200 

CCCCCGCCTG GACCTGGGCG CAACCCCCGA GGGGGGCCCT GTCCAGGCTA TGAGAGCTAC 4260

CCTGAGACTG ATCACGGGGT ATTTGAGGAT CCTCATGTGC CTTTTCATGT CAGGTGTGAG 4320

AGGAGGGACT CAAAGGTGGA GGTCATAGAG CTACAGGACG TGGAATGTGA GGAGAGGCCG 4380

TGGGGGAGCA GCTCCAACTG AGGGTAATTA AAATCTGAAG CAAAGAGGCC AAAGATTGGA 4440

AAGCCCCGCC CCCACCTCTT TCCAGAACTG CTTGAAGAGA ACTGCTTGGA ATTATGGGAA 4500

GGCAGTTCAT TGTTACTGTA ACTGATTGTA TTATTKKGTG AAATATTTCT ATAAATATTT 4560

AARAGGTGTA CACATGTAAT ATACATGGAA ATGCTGTACA GTCTATTTCC TGGGGCCTCT 4620

CCACTCCTGC CCCAGAGTGG GGAGACCACA GGGGCCCTTT CCCCTGTGTA CATTGGTCTC 4680

TGTGCCACAA CCAAGCTTAA CTTAGTTTTA AAAAAAATCT CCCAGCATAT GTCGCTGCTG 4740

CTTAAATATT GTATAATTTA CTTGTATAAT TCTATGCAAA TATTGCTTAT GTAATAGGAT 4800

TATTTGTAAA GGTTTCTGTT TAAAATATTT TAAATTTGCA TATCACAACC CTGTGGTAGG 4860

ATGAATTGTT ACTGTTAACT TTTGAACACG CTATGCGTGG TAATTGTTTA ACGAGCAGAC 4920

ATGAAGAAAA CAGGTTAATC CCAGTGGCTT CTCTAGGGGT AGTTGTATAT GGTTCGCATG 4980

GGTGGATGTG TGTGTGCATG TGACTTTCCA ATGTACTGTA TTGTGGTTTG TTGTTGTTGT 5040

TGCTGTTGTT GTTCATTTTG GTGTTTTTGG TTGCTTTGTA TGATCTTAGC TCTGGCCTAG 5100 

GTGGGCTGGG AAGGTCCAGG TCTTTTTCTG TCGTGATGCT GGTGGAAAGG TGACCCCAAT 5160 

CATCTGTCCT ATTCTCTGGG ACTATTC 5187 
(2) INFORMATION FOR SEQ ID NO : 4 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1311 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Val Ala Pro Asp Ser Glu Ala Pro Ser Asn Pro Arg Ile Thr Ala
1 5 10 15

Ala His Glu Ser Pro Cys Ala Thr Glu Ala Arg His Ser Ala Asp Leu
20 25 30

Tyr Ile Arg Thr Ser Trp Val Asp Ala Ala Leu Ala Leu Ser Glu Leu
35 40 45

Glu Lys Gly Asn Ile Glu Gly Gly Arg Thr Ser Leu Trp Ile Arg Ala
50 55 60

Trp Leu Gln Glu Gln Leu Phe Ile Leu Gly Cys Phe Leu Gln Gly Asp
65 70 75 80

Ala Gly Lys Val Leu Phe Val Ala Ile Leu Val Leu Ser Thr Phe Cys
85 90 95

Val Gly Leu Lys Ser Ala Gln Ile His Thr Arg Val Asp Gln Leu Trp
100 105 110

Val Gln Glu Gly Gly Arg Leu Glu Ala Glu Leu Lys Tyr Thr Ala Gln
115 120 125

Ala Leu Gly Glu Ala Asp Ser Ser Thr His Gln Leu Val Ile Gln Thr
130 135 140

Ala Lys Asp Pro Asp Val Ser Leu Leu His Pro Gly Ala Leu Leu Glu
145 150 155 160

His Leu Lys Val Val His Ala Ala Thr Arg Val Thr Val His Met Tyr
165 170 175

Asp Ile Glu Trp Arg Leu Lys Asp Leu Cys Tyr Ser Pro Ser Ile Pro
180 185 190

Asp Phe Glu Gly Tyr His His Ile Glu Ser Ile Ile Asp Asn Val Ile
195 200 205

Pro Cys Ala Ile Ile Thr Pro Leu Asp Cys Phe Trp Glu Gly Ser Lys
210 215 220

Leu Leu Gly Pro Asp Tyr Pro Ile Tyr Val Pro His Leu Lys His Lys
225 230 235 240

Leu Gln Trp Thr His Leu Asn Pro Leu Glu Val Val Glu Glu Val Lys
245 250 255

Lys Leu Lys Phe Gln Phe Pro Leu Ser Thr Ile Glu Ala Tyr Met Lys
260 265 270

Arg Ala Gly Ile Thr Ser Ala Tyr Met Lys Lys Pro Cys Leu Asp Pro
275 280 285

Thr Asp Pro His Cys Pro Ala Thr Ala Pro Asn Lys Lys Ser Gly His
290 295 300

Ile Pro Asp Val Ala Ala Glu Leu Ser His Gly Cys Tyr Gly Phe Ala
305 310 315 320

Ala Ala Tyr Met His Trp Pro Glu Gln Leu Ile Val Gly Gly Ala Thr
325 330 335

Arg Asn Ser Thr Ser Ala Leu Arg Lys Ala Arg Xaa Leu Gln Thr Val
340 345 350

Val Gln Leu Met Gly Glu Arg Glu Met Tyr Glu Tyr Trp Ala Asp His
355 360 365

Tyr Lys Val His Gln Ile Gly Trp Asn Gln Glu Lys Ala Ala Ala Val
370 375 380

Leu Asp Ala Trp Gln Arg Lys Phe Ala Ala Glu Val Arg Lys Ile Thr
385 390 395 400

Thr Ser Gly Ser Val Ser Ser Ala Tyr Ser Phe Tyr Pro Phe Ser Thr
405 410 415

Ser Thr Leu Asn Asp Ile Leu Gly Lys Phe Ser Glu Val Ser Leu Lys
420 425 430

Asn Ile Ile Leu Gly Tyr Met Phe Met Leu Ile Tyr Val Ala Val Thr
435 440 445

Leu Ile Gln Trp Arg Asp Pro Ile Arg Ser Gln Ala Gly Val Gly Ile
450 455 460

Ala Gly Val Leu Leu Leu Ser Ile Thr Val Ala Ala Gly Leu Gly Phe
465 470 475 480

Cys Ala Leu Leu Gly Ile Pro Phe Asn Ala Ser Ser Thr Gln Ile Val
485 490 495

Pro Phe Leu Ala Leu Gly Leu Gly Val Gln Asp Met Phe Leu Leu Thr
500 505 510

His Thr Tyr Val Glu Gln Ala Gly Asp Val Pro Arg Glu Glu Arg Thr
515 520 525

Gly Leu Val Leu Lys Lys Ser Gly Leu Ser Val Leu Leu Ala Ser Leu
530 535 540

Cys Asn Val Met Ala Phe Leu Ala Ala Ala Leu Leu Pro Ile Pro Ala
545 550 555 560

Phe Arg Val Phe Cys Leu Gln Ala Ala Ile Leu Leu Leu Phe Asn Leu
565 570 575

Gly Ser Ile Leu Leu Val Phe Pro Ala Met Ile Ser Leu Asp Leu Arg
580 585 590

Arg Arg Ser Ala Ala Arg Ala Asp Leu Leu Cys Cys Leu Met Pro Glu
595 600 605

Ser Pro Leu Pro Lys Lys Lys Ile Pro Glu Arg Ala Lys Thr Arg Lys
610 615 620

Asn Asp Lys Thr His Arg Ile Asp Thr Thr Arg Gln Pro Leu Asp Pro
625 630 635 640

Asp Val Ser Glu Asn Val Thr Lys Thr Cys Cys Leu Ser Val Ser Leu
645 650 655

Thr Lys Trp Ala Lys Asn Gln Tyr Ala Pro Phe Ile Met Arg Pro Ala
660 665 670

Val Lys Val Thr Ser Met Leu Ala Leu Ile Ala Val Ile Leu Thr Ser
675 680 685

Val Trp Gly Ala Thr Lys Val Lys Asp Gly Leu Asp Leu Thr Asp Ile
690 695 700

Val Pro Glu Asn Thr Asp Glu His Glu Phe Leu Ser Arg Gln Glu Lys
705 710 715 720

Tyr Phe Gly Phe Tyr Asn Met Tyr Ala Val Thr Gln Gly Asn Phe Glu
725 730 735

Tyr Pro Thr Asn Gln Lys Leu Leu Tyr Glu Tyr His Asp Gln Phe Val
740 745 750

Arg Ile Pro Asn Ile Ile Lys Asn Asp Asn Gly Gly Leu Thr Lys Phe
755 760 765

Trp Leu Ser Leu Phe Arg Asp Trp Leu Leu Asp Leu Gln Val Ala Phe
770 775 780

Asp Lys Glu Val Ala Ser Gly Cys Ile Thr Gln Glu Tyr Trp Cys Lys
785 790 795 800

Asn Ala Ser Asp Glu Gly Ile Leu Ala Tyr Lys Leu Met Val Gln Thr
805 810 815

Gly His Val Asp Asn Pro Ile Asp Lys Ser Leu Ile Thr Ala Gly His
820 825 830

Arg Leu Val Asp Lys Asp Gly Ile Ile Asn Pro Lys Ala Phe Tyr Asn
835 840 845

Tyr Leu Ser Ala Trp Ala Thr Asn Asp Ala Leu Ala Tyr Gly Ala Ser
850 855 860

Gln Gly Asn Leu Lys Pro Gln Pro Gln Arg Trp Ile His Ser Pro Glu
865 870 875 880

Asp Val His Leu Glu Ile Lys Lys Ser Ser Pro Leu Ile Tyr Thr Gln
885 890 895

Leu Pro Phe Tyr Leu Ser Gly Leu Ser Asp Thr Xaa Ser Ile Lys Thr
900 905 910

Leu Ile Arg Ser Val Arg Asp Leu Cys Leu Lys Tyr Glu Ala Lys Gly
915 920 925

Leu Pro Asn Phe Pro Ser Gly Ile Pro Phe Leu Phe Trp Glu Gln Tyr
930 935 940

Leu Tyr Leu Arg Thr Ser Leu Leu Leu Ala Leu Ala Cys Ala Leu Ala
945 950 955 960

Ala Val Phe Ile Ala Val Met Val Leu Leu Leu Asn Ala Trp Ala Ala
965 970 975

Val Leu Val Thr Leu Ala Leu Ala Thr Leu Val Leu Gln Leu Leu Gly
980 985 990

Val Met Ala Leu Leu Gly Val Lys Leu Ser Ala Met Pro Ala Val Leu
995 1000 1005

Leu Val Leu Ala Ile Gly Arg Gly Val His Phe Thr Val His Leu Cys
1010 1015 1020

Leu Gly Phe Val Thr Ser Ile Gly Cys Lys Arg Arg Arg Ala Ser Leu
1025 1030 1035 1040

Ala Leu Glu Ser Val Leu Ala Pro Val Val His Gly Ala Leu Ala Ala
1045 1050 1055

Ala Leu Ala Ala Ser Met Leu Ala Ala Ser Glu Cys Gly Phe Val Ala
1060 1065 1070

Arg Leu Phe Leu Arg Leu Leu Leu Asp Ile Val Phe Leu Gly Leu Ile
1075 1080 1085

Asp Gly Leu Leu Phe Phe Pro Ile Val Leu Ser Ile Leu Gly Pro Ala
1090 1095 1100

Ala Glu Val Arg Pro Ile Glu His Pro Glu Arg Leu Ser Thr Pro Ser
1105 1110 1115 1120

Pro Lys Cys Ser Pro Ile His Pro Arg Lys Ser Ser Ser Ser Ser Gly
1125 1130 1135

Gly Gly Asp Lys Ser Ser Arg Thr Ser Lys Ser Ala Pro Arg Pro Cys
1140 1145 1150

Ala Pro Ser Leu Thr Thr Ile Thr Glu Glu Pro Ser Ser Trp His Ser
1155 1160 1165

Ser Ala His Ser Val Gln Ser Ser Met Gln Ser Ile Val Val Gln Pro
1170 1175 1180

Glu Val Val Val Glu Thr Thr Thr Tyr Asn Gly Ser Asp Ser Ala Ser
1185 1190 1195 1200

Gly Arg Ser Thr Pro Thr Lys Ser Ser His Gly Gly Ala Ile Thr Thr
1205 1210 1215

Thr Lys Val Thr Ala Thr Ala Asn Ile Lys Val Glu Val Val Thr Pro
1220 1225 1230

Ser Asp Arg Lys Ser Arg Arg Ser Tyr His Tyr Tyr Asp Arg Arg Arg
1235 1240 1245

Asp Arg Asp Glu Asp Arg Asp Arg Asp Arg Glu Arg Asp Arg Asp Arg
1250 1255 1260

Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg
1265 1270 1275 1280

Glu Arg Ser Arg Glu Arg Asp Arg Arg Asp Arg Tyr Arg Asp Glu Arg
1285 1290 1295

Asp His Arg Ala Ser Pro Arg Glu Lys Arg Gln Arg Phe Trp Thr
1300 1305 1310



(2) INFORMATION FOR SEQ ID NO : 5 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4434 base pairs 

(B) TYPE : nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5 : 

CGAAACAAGA GAGCGAGTGA GAGTAGGGAG AGCGTCTGTG TTGTGTGTTG AGTGTCGCCC 60

ACGCACACAG GCGCAAAACA GTGCACACAG ACGCCCGCTG GGCAAGAGAG AGTGAGAGAG 120

AGAAACAGCG GCGCGCGCTC GCCTAATGAA GTTGTTGGCC TGGCTGGCGT GCCGCATCCA 180

CGAGATACAG ATACATCTCT CATGGACCGC GACAGCCTCC CACGCGTTCC GGACACACAC 240

GGCGATGTGG TCGATGAGAA ATTATTCTCG GATCTTTACA TACGCACCAG CTGGGTGGAC 300

GCCCAAGTGG CGCTCGATCA GATAGATAAG GGCAAAGCGC GTGGCAGCCG CACGGCGATC 360

TATCTGCGAT CAGTATTCCA GTCCCACCTC GAAACCCTCG GCAGCTCCGT GCAAAAGCAC 420

GCGGGCAAGG TGCTATTCGT GGCTATCCTG GTGCTGAGCA CCTTCTGCGT CGGCCTGAAG 480

AGCGCCCAGA TCCACTCCAA GGTGCACCAG CTGTGGATCC AGGAGGGCGG CCGGCTGGAG 540

GCGGAACTGG CCTACACACA GAAGACGATC GGCGAGGACG AGTCGGCCAC GCATCAGCTG 600

CTCATTCAGA CGACCCACGA CCCGAACGCC TCCGTCCTGC ATCCGCAGGC GCTGCTTGCC 660

CACCTGGAGG TCCTGGTCAA GGCCACCGCC GTCAAGGTGC ACCTCTACGA CACCGAATGG 720

GGGCTGCGCG ACATGTGCAA CATGCCGAGC ACGCCCTCCT TCGAGGGCAT CTACTACATC 780

GAGCAGATCC TGCGCCACCT CATTCCGTGC TCGATCATCA CGCCGCTGGA CTGTTTCTGG 840

GAGGGAAGCC AGCTGTTGGG TCCGGAATCA GCGGTCGTTA TACCAGGCCT CAACCAACGA 900

CTCCTGTGGA CCACCCTGAA TCCCGCCTCT GTGATGCAGT ATATGAAACA AAAGATGTCC 960

GAGGAAAAGA TCAGCTTCGA CTTCGAGACC GTGGAGCAGT ACATGAAGCG TGCGGCCATT 1020

GGCAGTGGCT ACATGGAGAA GCCCTGCCTG AACCCACTGA ATCCCAATTG CCCGGACACG 1080

GCACCGAACA AGAACAGCAC CCAGCCGCCG GATGTGGGAG CCATCCTGTC CGGAGGCTGC 1140

TACGGTTATG CCGCGAAGCA CATGCACTGG CCGGAGGAGC TGATTGTGGG CGGACGGAAG 1200

AGGAACCGCA GCGGACACTT GAGGAAGGCC CAGGCCCTGC AGTCGGTGGT GCAGCTGATG 1260

ACCGAGAAGG AAATGTACGA CCAGTGGCAG GACAACTACA AGGTGCACCA TCTTGGATGG 1320

ACGCAGGAGA AGGCAGCGGA GGTTTTGAAC GCCTGGCAGC GCAACTTTTC GCGGGAGGTG 1380

GAACAGCTGC TACGTAAACA GTCGAGAATT GCCACCAACT ACGATATCTA CGTGTTCAGC 1440

TCGGCTGCAC TGGATGACAT CCTGGCCAAG TTCTCCCATC CCAGCGCCTT GTCCATTGTC 1500

ATCGGCGTGG CCGTCACCGT TTTGTATGCC TTTTGCACGC TCCTCCGCTG GAGGGACCCC 1560

GTCCGTGGCC AGAGCAGTGT GGGCGTGGCC GGAGTTCTGC TCATGTGCTT CAGTACCGCC 1620

GCCGGATTGG GATTGTCAGC CCTGCTCGGT ATCGTTTTCA ATGCGCTGAC CGCTGCCTAT 1680

GCGGAGAGCA ATCGGCGGGA GCAGACCAAG CTGATTCTCA AGAACGCCAG CACCCAGGTG 1740

GTTCCGTTTT TGGCCCTTGG TCTGGGCGTC GATCACATCT TCATAGTGGG ACCGAGCATC 1800

CTGTTCAGTG CCTGCAGCAC CGCAGGATCC TTCTTTGCGG CCGCCTTTAT TCCGGTGCCG 1860

GCTTTGAAGG TATTCTGTCT GCAGGCTGCC ATCGTAATGT GCTCCAATTT GGCAGCGGCT 1920

CTATTGGTTT TTCCGGCCAT GATTTCGTTG GATCTACGGA GACGTACCGC CGGCAGGGCG 1980

GACATCTTCT GCTGCTGTTT TCCGGTGTGG AAGGAACAGC CGAAGGTGGC ACCTCCGGTG 2040

CTGCCGCTGA ACAACAACAA CGGGCGCGGG GCCCGGCATC CGAAGAGCTG CAACAACAAC 2100

AGGGTGCCGC TGCCCGCCCA GAATCCTCTG CTGGAACAGA GGGCAGACAT CCCTGGGAGC 2160

AGTCACTCAC TGGCGTCCTT CTCCCTGGCA ACCTTCGCCT TTCAGCACTA CACTCCCTTC 2220

CTCATGCGCA GCTGGGTGAA GTTCCTGACC GTTATGGGTT TCCTGGCGGC CCTCATATCC 2280

AGCTTGTATG CCTCCACGCG CCTTCAGGAT GGCCTGGACA TTATTGATCT GGTGCCCAAG 2340

GACAGCAACG AGCACAAGTT CCTGGATGCT CAAACTCGGC TCTTTGGCTT CTACAGCATG 2400

TATGCGGTTA CCCAGGGCAA CTTTGAATAT CCCACCCAGC AGCAGTTGCT CAGGGACTAC 2460

CATGATTCCT TTGTGCGGGT GCCACATGTG ATCAAGAATG ATAACGGTGG ACTGCCGGAC 2520

TTCTGGCTGC TGCTCTTCAG CGAGTGGCTG GGTAATCTGC AAAAGATATT CGACGAGGAA 2580

TACCGCGACG GACGGCTGAC CAAGGAGTGC TGGTTCCCAA ACGCCAGCAG CGATGCCATC 2640

CTGGCCTACA AGCTAATCGT GCAAACCGGC CATGTGGACA ACCCCGTGGA CAAGGAACTG 2700

GTGCTCACCA ATCGCCTGGT CAACAGCGAT GGCATCATCA ACCAACGCGC CTTCTACAAC 2760

TATCTGTCGG CATGGGCCAC CAACGACGTC TTCGCCTACG GAGCTTCTCA GGGCAAATTG 2820

TATCCGGAAC CGCGCCAGTA TTTTCACCAA CCCAACGAGT ACGATCTTAA GATACCCAAG 2880

AGTCTGCCAT TGGTCTACGC TCAGATGCCC TTTTACCTCC ACGGACTAAC AGATACCTCG 2940

CAGATCAAGA CCCTGATAGG TCATATTCGC GACCTGAGCG TCAAGTACGA GGGCTTCGGC 3000

CTGCCCAACT ATCCATCGGG CATTCCCTTC ATCTTCTGGG AGCAGTACAT GACCCTGCGC 3060

TCCTCACTGG CCATGATCCT GGCCTGCGTG CTACTCGCCG CCCTGGTGCT GGTCTCCCTG 3120

CTCCTGCTCT CCGTTTGGGC CGCCGTTCTC GTGATCCTCA GCGTTCTGGC CTCGCTGGCC 3180

CAGATCTTTG GGGCCATGAC TCTGCTGGGC ATCAAACTCT CGGCCATTCC GGCAGTCATA 3240

CTCATCCTCA GCGTGGGCAT GATGCTGTGC TTCAATGTGC TGATATCACT GGGCTTCATG 3300

ACATCCGTTG GCAACCGACA GCGCCGCGTC CAGCTGAGCA TGCAGATGTC CCTGGGACCA 3360



CTTGTCCACG GCATGCTGAC CTCCGGAGTG GCCGTGTTCA TGCTCTCCAC GTCGCCCTTT 3420

GAGTTTGTGA TCCGGCACTT CTGCTGGCTT CTGCTGGTGG TCTTATGCGT TGGCGCCTGC 3480

AACAGCCTTT TGGTGTTCCC CATCCTACTG AGCATGGTGG GACCGGAGGC GGAGCTGGTG 3540

CCGCTGGAGC ATCCAGACCG CATATCCACG CCCTCTCCGC TGCCCGTGCG CAGCAGCAAG 3600

AGATCGGGCA AATCCTATGT GGTGCAGGGA TCGCGATCCT CGCGAGGCAG CTGCCAGAAG 3660

TCGCATCACC ACCACCACAA AGACCTTAAT GATCCATCGC TGACGACGAT CACCGAGGAG 3720

CCGCAGTCGT GGAAGTCCAG CAACTCGTCC ATCCAGATGC CCAATGATTG GACCTACCAG 3780

CCGCGGGAAC AGCGACCCGC CTCCTACGCG GCCCCGCCCC CCGCCTATCA CAAGGCCGCC 3840

GCCCAGCAGC ACCACCAGCA TCAGGGCCCG CCCACAACGC CCCCGCCTCC CTTCCCGACG 3900

GCCTATCCGC CGGAGCTGCA GAGCATCGTG GTGCAGCCGG AGGTGACGGT GGAGACGACG 3960

CACTCGGACA GCAACACCAC CAAGGTGACG GCCACGGCCA ACATCAAGGT GGAGCTGGCC 4020

ATGCCCGGCA GGGCGGTGCG CAGCTATAAC TTTACGAGTT AGCACTAGCA CTAGTTCCTG 4080

TAGCTATTAG GACGTATCTT TAGACTCTAG CCTAAGCCGT AACCCTATTT GTATCTGTAA 4140

AATCGATTTG TCCAGCGGGT CTGCTGAGGA TTTCGTTCTC ATGGATTCTC ATGGATTCTC 4200

ATGGATGCTT AAATGGCATG GTAATTGGCA AAATATCAAT TTTTGTGTCT CAAAAAGATG 4260

CATTAGCTTA TGGTTTCAAG ATACATTTTT AAAGAGTCCG CCAGATATTT ATATAAAAAA 4320

AATCCAAAAT CGACGTATCC ATGAAAATTG AAAAGCTAAG CAGACCCGTA TGTATGTATA 4380

TGTGTATGCA TGTTAGTTAA TTTCCCGAAG TCCGGTATTT ATAGCAGCTG CCTT 4434

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1285 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Asp Arg Asp Ser Leu Pro Arg Val Pro Asp Thr His Gly Asp Val
1 5 10 15

Val Asp Glu Lys Leu Phe Ser Asp Leu Tyr Ile Arg Thr Ser Trp Val
20 25 30

Asp Ala Gln Val Ala Leu Asp Gln Ile Asp Lys Gly Lys Ala Arg Gly
35 40 45

Ser Arg Thr Ala Ile Tyr Leu Arg Ser Val Phe Gln Ser His Leu Glu
50 55 60

Thr Leu Gly Ser Ser Val Gln Lys His Ala Gly Lys Val Leu Phe Val
65 70 75 80

Ala Ile Leu Val Leu Ser Thr Phe Cys Val Gly Leu Lys Ser Ala Gln
85 90 95

Ile His Ser Lys Val His Gln Leu Trp Ile Gln Glu Gly Gly Arg Leu
100 105 110

Glu Ala Glu Leu Ala Tyr Thr Gln Lys Thr Ile Gly Glu Asp Glu Ser
115 120 125

Ala Thr His Gln Leu Leu Ile Gln Thr Thr His Asp Pro Asn Ala Ser
130 135 140

Val Leu His Pro Gln Ala Leu Leu Ala His Leu Glu Val Leu Val Lys
145 150 155 160

Ala Thr Ala Val Lys Val His Leu Tyr Asp Thr Glu Trp Gly Leu Arg
165 170 175

Asp Met Cys Asn Met Pro Ser Thr Pro Ser Phe Glu Gly Ile Tyr Tyr
180 185 190

Ile Glu Gln Ile Leu Arg His Leu Ile Pro Cys Ser Ile Ile Thr Pro
195 200 205

Leu Asp Cys Phe Trp Glu Gly Ser Gln Leu Leu Gly Pro Glu Ser Ala
210 215 220

Val Val Ile Pro Gly Leu Asn Gln Arg Leu Leu Trp Thr Thr Leu Asn
225 230 235 240

Pro Ala Ser Val Met Gln Tyr Met Lys Gln Lys Met Ser Glu Glu Lys
245 250 255

Ile Ser Phe Asp Phe Glu Thr Val Glu Gln Tyr Met Lys Arg Ala Ala
260 265 270

Ile Gly Ser Gly Tyr Met Glu Lys Pro Cys Leu Asn Pro Leu Asn Pro
275 280 285

Asn Cys Pro Asp Thr Ala Pro Asn Lys Asn Ser Thr Gln Pro Pro Asp
290 295 300

Val Gly Ala Ile Leu Ser Gly Gly Cys Tyr Gly Tyr Ala Ala Lys His
305 310 315 320

Met His Trp Pro Glu Glu Leu Ile Val Gly Gly Arg Lys Arg Asn Arg
325 330 335

Ser Gly His Leu Arg Lys Ala Gln Ala Leu Gln Ser Val Val Gln Leu
340 345 350

Met Thr Glu Lys Glu Met Tyr Asp Gln Trp Gln Asp Asn Tyr Lys Val
355 360 365

His His Leu Gly Trp Thr Gln Glu Lys Ala Ala Glu Val Leu Asn Ala
370 375 380

Trp Gln Arg Asn Phe Ser Arg Glu Val Glu Gln Leu Leu Arg Lys Gln
385 390 395 400

Ser Arg Ile Ala Thr Asn Tyr Asp Ile Tyr Val Phe Ser Ser Ala Ala
405 410 415

Leu Asp Asp Ile Leu Ala Lys Phe Ser His Pro Ser Ala Leu Ser Ile
420 425 430

Val Ile Gly Val Ala Val Thr Val Leu Tyr Ala Phe Cys Thr Leu Leu
435 440 445

Arg Trp Arg Asp Pro Val Arg Gly Gln Ser Ser Val Gly Val Ala Gly
450 455 460

Val Leu Leu Met Cys Phe Ser Thr Ala Ala Gly Leu Gly Leu Ser Ala
465 470 475 480

Leu Leu Gly Ile Val Phe Asn Ala Leu Thr Ala Ala Tyr Ala Glu Ser
485 490 495

Asn Arg Arg Glu Gln Thr Lys Leu Ile Leu Lys Asn Ala Ser Thr Gln
500 505 510

Val Val Pro Phe Leu Ala Leu Gly Leu Gly Val Asp His Ile Phe Ile
515 520 525

Val Gly Pro Ser Ile Leu Phe Ser Ala Cys Ser Thr Ala Gly Ser Phe
530 535 540

Phe Ala Ala Ala Phe Ile Pro Val Pro Ala Leu Lys Val Phe Cys Leu
545 550 555 560

Gln Ala Ala Ile Val Met Cys Ser Asn Leu Ala Ala Ala Leu Leu Val
565 570 575

Phe Pro Ala Met Ile Ser Leu Asp Leu Arg Arg Arg Thr Ala Gly Arg
580 585 590

Ala Asp Ile Phe Cys Cys Cys Phe Pro Val Trp Lys Glu Gln Pro Lys
595 600 605

Val Ala Pro Pro Val Leu Pro Leu Asn Asn Asn Asn Gly Arg Gly Ala
610 615 620

Arg His Pro Lys Ser Cys Asn Asn Asn Arg Val Pro Leu Pro Ala Gln
625 630 635 640

Asn Pro Leu Leu Glu Gln Arg Ala Asp Ile Pro Gly Ser Ser His Ser
645 650 655

Leu Ala Ser Phe Ser Leu Ala Thr Phe Ala Phe Gln His Tyr Thr Pro
660 665 670

Phe Leu Met Arg Ser Trp Val Lys Phe Leu Thr Val Met Gly Phe Leu
675 680 685

Ala Ala Leu Ile Ser Ser Leu Tyr Ala Ser Thr Arg Leu Gln Asp Gly
690 695 700

Leu Asp Ile Ile Asp Leu Val Pro Lys Asp Ser Asn Glu His Lys Phe
705 710 715 720

Leu Asp Ala Gln Thr Arg Leu Phe Gly Phe Tyr Ser Met Tyr Ala Val
725 730 735

Thr Gln Gly Asn Phe Glu Tyr Pro Thr Gln Gln Gln Leu Leu Arg Asp
740 745 750

Tyr His Asp Ser Phe Arg Val Pro His Val Ile Lys Asn Asp Asn Gly
755 760 765

Gly Leu Pro Asp Phe Trp Leu Leu Leu Phe Ser Glu Trp Leu Gly Asn
770 775 780

Leu Gln Lys Ile Phe Asp Glu Glu Tyr Arg Asp Gly Arg Leu Thr Lys
785 790 795 800

Glu Cys Trp Phe Pro Asn Ala Ser Ser Asp Ala Ile Leu Ala Tyr Lys
805 810 815

Leu Ile Val Gln Thr Gly His Val Asp Asn Pro Val Asp Lys Glu Leu
820 825 830

Val Leu Thr Asn Arg Leu Val Asn Ser Asp Gly Ile Ile Asn Gln Arg
835 840 845

Ala Phe Tyr Asn Tyr Leu Ser Ala Trp Ala Thr Asn Asp Val Phe Ala
850 855 860

Tyr Gly Ala Ser Gln Gly Lys Leu Tyr Pro Glu Pro Arg Gln Tyr Phe
865 870 875 880

His Gln Pro Asn Glu Tyr Asp Leu Lys Ile Pro Lys Ser Leu Pro Leu
885 890 895

Val Tyr Ala Gln Met Pro Phe Tyr Leu His Gly Leu Thr Asp Thr Ser
900 905 910

Gln Ile Lys Thr Leu Ile Gly His Ile Arg Asp Leu Ser Val Lys Tyr
915 920 925

Glu Gly Phe Gly Leu Pro Asn Tyr Pro Ser Gly Ile Pro Phe Ile Phe
930 935 940

Trp Glu Gln Tyr Met Thr Leu Arg Ser Ser Leu Ala Met Ile Leu Ala
945 950 955 960

Cys Val Leu Leu Ala Ala Leu Val Leu Val Ser Leu Leu Leu Leu Ser
965 970 975

Val Trp Ala Ala Val Leu Val Ile Leu Ser Val Leu Ala Ser Leu Ala
980 985 990

Gln Ile Phe Gly Ala Met Thr Leu Leu Gly Ile Lys Leu Ser Ala Ile
995 1000 1005

Pro Ala Val Ile Leu Ile Leu Ser Val Gly Met Met Leu Cys Phe Asn
1010 1015 1020

Val Leu Ile Ser Leu Gly Phe Met Thr Ser Val Gly Asn Arg Gln Arg
1025 1030 1035 1040

Arg Val Gln Leu Ser Met Gln Met Ser Leu Gly Pro Leu Val His Gly
1045 1050 1055

Met Leu Thr Ser Gly Val Ala Val Phe Met Leu Ser Thr Ser Pro Phe
1060 1065 1070

Glu Phe Val Ile Arg His Phe Cys Trp Leu Leu Leu Val Val Leu Cys
1075 1080 1085

Val Gly Ala Cys Asn Ser Leu Leu Val Phe Pro Ile Leu Leu Ser Met
1090 1095 1100

Val Gly Pro Glu Ala Glu Leu Val Pro Leu Glu His Pro Asp Arg Ile
1105 1110 1115 1120

Ser Thr Pro Ser Pro Leu Pro Val Arg Ser Ser Lys Arg Ser Gly Lys
1125 1130 1135

Ser Tyr Val Val Gln Gly Ser Arg Ser Ser Arg Gly Ser Cys Gln Lys
1140 1145 1150

Ser His His His His His Lys Asp Leu Asn Asp Pro Ser Leu Thr Thr
1155 1160 1165

Ile Thr Glu Glu Pro Gln Ser Trp Lys Ser Ser Asn Ser Ser Ile Gln
1170 1175 1180

Met Pro Asn Asp Trp Thr Tyr Gln Pro Arg Glu Gln Arg Pro Ala Ser
1185 1190 1195 1200

Tyr Ala Ala Pro Pro Pro Ala Tyr His Lys Ala Ala Ala Gln Gln His
1205 1210 1215

His Gln His Gln Gly Pro Pro Thr Thr Pro Pro Pro Pro Phe Pro Thr
1220 1225 1230

Ala Tyr Pro Pro Glu Leu Gln Ser Ile Val Val Gln Pro Glu Val Thr
1235 1240 1245

Val Glu Thr Thr His Ser Asp Ser Asn Thr Thr Lys Val Thr Ala Thr
1250 1255 1260

Ala Asn Ile Lys Val Glu Leu Ala Met Pro Gly Arg Ala Val Arg Ser
1265 1270 1275 1280

Tyr Asn Phe Thr Ser
1285

(2) INFORMATION FOR SEQ ID NO : 7 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 345 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 7 : 
AAGGTCCATC AGCTTTGGAT ACAGGAAGGT GGTTCGCTCG AGCATGAGCT AGCCTACACG 60

CAGAAATCGC TCGGCGAGAT GGACTCCTCC ACGCACCAGC TGCTAATCCA AACNCCCAAA 120

GATATGGACG CCTCGATACT GCACCCGAAC GCGCTACTGA CGCACCTGGA CGTGGTGAAG 180

AAAGCGATCT CGGTGACGGT GCACATGTAC GACATCACGT GGAGNCTCAA GGACATGTGC 240

TACTCGCCCA GCATACCGAG NTTCGATACG CACTTTATCG AGCAGATCTT CGAGAACATC 300

ATACCGTGCG CGATCATCAC GCCGCTGGAT TGCTTTTGGG AGGGA 345

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 115 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 8 : 
Lys Val His Gln Leu Trp Ile Gln Glu Gly Gly Ser Leu Glu His Glu
1 5 10 15

Leu Ala Tyr Thr Gln Lys Ser Leu Gly Glu Met Asp Ser Ser Thr His
20 25 30

Gln Leu Leu Ile Gln Thr Pro Lys Asp Met Asp Ala Ser Ile Leu His
35 40 45

Pro Asn Ala Leu Leu Thr His Leu Asp Val Val Lys Lys Ala Ile Ser
50 55 60

Val Thr Val His Met Tyr Asp Ile Thr Trp Xaa Leu Lys Asp Met Cys
65 70 75 80

Tyr Ser Pro Ser Ile Pro Xaa Phe Asp Thr His Phe Ile Glu Gln Ile
85 90 95

Phe Glu Asn Ile Ile Pro Cys Ala Ile Ile Thr Pro Leu Asp Cys Phe
100 105 110

Trp Glu Gly
115

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5187 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
GOGTCTGTCA CCCGGAGCCG GAGTCCCCGG CGGCCAGCAG CGTCCTCGCG AGCCGAGCGC 60 
CCAGGCGCGC CCGGAGCCCG CGGCGGCGGC GGCAACATGG CCTCGGCTGG TAACGCCGCC 120 






GGGGCCCTGG GCAGGCAGGC CGGCGGCGGG AGGCGCAGAC GGACCGGGGG ACCGCACCGC 180

GCCGCGCCGG ACCGGGACTA TCTGCACCGG CCCAGCTACT GCGACGCCGC CTTCGCTCTG 240

GAGCAGATTT CCAAGGGGAA GGCTACTGGC CGGAAAGCGC CGCTGTGGCT GAGAGCGAAG 300

TTTCAGAGAC TCTTATTTAA ACTGGGTTGT TACATTCAAA AGAACTGCGG CAAGTTTTTG 360

GTTGTGGGTC TCCTCATATT TGGGGCCTTC GCTGTGGGAT TAAAGGCAGC TAATCTCGAG 420

ACCAACGTGG AGGAGCTGTG GGTGGAAGTT GGTGGACGAG TGAGTCGAGA ATTAAATTAT 480

ACCCGTCAGA AGATAGGAGA AGAGGCTATG TTTAATCCTC AACTCATGAT ACAGACTCCA 540

AAAGAAGAAG GCGCTAATGT TCTGACCACA GAGGCTCTCC TGCAACACCT GGACTCAGCA 600

CTCCAGGCCA GTCGTGTGCA CGTCTACATG TATAACAGGC AATGGAAGTT GGAACATTTG 660

TGCTACAAAT CAGGGGAACT TATCACGGAG ACAGGTTACA TGGATCAGAT AATAGAATAC 720

CTTTACCCTT GCTTAATCAT TACACCTTTG GACTGCTTCT GGGAAGGGGC AAAGCTACAG 780

TCPGGGACAG CATACCTCCT AGGTAAGCCT CCTTTACGGT GGACAAACTT TGACCCCTTG 840

GAATTCCTAG AAGAGTTAAA GAAAATAAAC TACCAAGTGG ACAGCTGGGA GGAAATGCTG 900

AATAAAGCCG AAGTTGGCCA TGGGTACATG GACCGGCCTT GCCTCAACCC AGCCGACCCA 960

GATTGCCCTG CCACAGCCCC TAACAAAAAT TCAACCAAAC CTCTTGATGT GGCCCTTGTT 1020

TTGAATGGTG GATGTCAAGG TTTATCCAGG AAGTATATGC ATTGGCAGGA GGAGTTGATT 1080

GTGGGTGGTA CCGTCAAGAA TGCCACTGGA AAACTTGTCA GCGCTCACGC CCTGCAAACC 1140

ATGTTCCAGT TAATGACTCC CAAGCAAATG TATGAACACT TCAGGGGCTA CGACTATGTC 1200

TCTCACATCA ACTGGAATGA AGACAGGGCA GCCGCCATCC TGGAGGCCTG GCAGAGGACT 1260

TACGTGGAGG TGGTTCATCA AAGTGTCGCC CCAAACTCCA CTCAAAAGGT GCTTCCCTTC 1320

ACAACCACGA CCCTGGACGA CATCCTAAAA TCCTTCTCTG ATGTCAGTGT CATCCGAGTG 1380

GCCAGCGGCT ACCTACTGAT GCTTGCCTAT GCCTGTTTAA CCATGCTGCG CTGGGACTGC 1440

TCCAAGTCCC AGGGTGCCGT GGGGCTGGCT GGCGTCCTGT TGGTTGCGCT GTCAGTGGCT 1500

GCAGGATTGG GCCTCTGCTC CTTGATTGGC ATTTCTTTTA ATGCTGCGAC AACTCAGGTT 1560

TTGCCGTTTC TTGCTCTTGG TGTTGGTGTG GATGATGTCT TCCTCCTGGC CCATGCATTC 1620

AGTGAAACAG GACAGAATAA GAGGATTCCA TTTGAGGACA GGACTGGGGA GTGCCTCAAG 1680

CGCACCGGAG CCAGCGTGGC CCTCACCTCC ATCAGCAATG TCACCGCCTT CTTCATGGCC 1740

GCATTGATCC CTATCCCTGC CCTGCGAGCG TTCTCCCTCC AGGCTGCTGT GGTGGTGGTA 1800

TTCAATTTTG CTATGGTTCT GCTCATTTTT CCTGCAATTC TCAGCATGGA TTTATACAGA 1860

CGTGAGGACA GAAGATTGGA TATTTTCTGC TGTTTCACAA GCCCCTGTGT CAGCAGGGTG 1920

ATTCAAGTTG AGCCACAGGC CTACACAGAG CCTCACAGTA ACACCCGGTA CAGCCCCCCA 1980

CCCCCATACA CCAGCCACAG CTTCGCCCAC GAAACCCATA TCACTATGCA GTCCACCGTT 2040



CAGCTCCGCA CAGAGTATGA CCCTCACACG CACGTGTACT ACACCACCGC CGAGCCACGC 2100

TCTGAGATCT CTGTACAGCC TGTTACCGTC ACCCAGGACA ACCTCAGCTG TCAGAGTCCC 2160

GAGAGCACCA GCTCTACCAG GGACCTGCTC TCCCAGTTCT CAGACTCCAG CCTCCACTGC 2220

CTCGAGCCCC CCTGCACCAA GTGGACACTC TCTTCGTTTG CAGAGAAGCA CTATGCTCCT 2280

TTCCTCCTGA AACCCAAAGC CAAGGTTGTG GTAATCCTTC TTTTCCTGGG CTTGCTGGGG 2340

GTCAGCCTTT ATGGGACCAC CCGAGTGAGA GACGGGCTGG ACCTCACGGA CATTGTTCCC 2400

CGGGAAACCA GAGAATATGA CTTCATAGCT GCCCAGTTCA AGTACTTCTC TTTCTACAAC 2460

ATGTATATAG TCACCCAGAA AGCAGACTAC CCGAATATCC AGCACCTACT TTACGACCTT 2520

CATAAGAGTT TCAGCAATGT GAAGTATGTC ATGCTGGAGG AGAACAAGCA ACTTCCCCAA 2580

ATGTGGCTGC ACTACTTTAG AGACTGGCTT CAAGGACTTC AGGATGCATT TGACAGTGAC 2640

TGGGAAACTG GGAGGATCAT GCCAAACAAT TATAAAAATG GATCAGATGA CGGGGTCCTC 2700

GCTTACAAAC TCCTGGTGCA GACTGGCAGC CGAGACAAGC CCATCGACAT TAGTCAGTTG 2760

ACTAAACAGC GTCTGGTAGA CGCAGATGGC ATCATTAATC CGAGCGCTTT CTACATCTAC 2820

CTGACCGCTT GGGTCAGCAA CGACCCTGTA GCTTACGCTG CCTCCCAGGC CAACATCCGG 2880

CCTCACCGGC CGGAGTGGGT CCATGACAAA GCCGACTACA TGCCAGAGAC CAGGCTGAGA 2940

ATCCCAGCAG CAGAGCCCAT CGAGTACGCT CAGTTCCCTT TCTACCTCAA CGGCCTACGA 3000

GACACCTCAG ACTTTGTGGA AGCCATAGAA AAAGTGAGAG TCATCTGTAA CAACTATACG 3060

AGCCTGGGAC TGTCCAGCTA CCCCAATGGC TACCCCTTCC TGTTCTGGGA GCAATACATC 3120

AGCCTGCGCC ACTGGCTGCT GCTATCCATC AGCGTGGTGC TGGCCTGCAC GTTTCTAGTG 3180

TGCGCAGTCT TCCTCCTGAA CCCCTGGACG GCCGGGATCA TTGTCATGGT CCTGGCTCTG 3240

ATGACCGTTG AGCTCTTTGG CATGATGGGC CTCATTGGGA TCAAGCTGAG TGCTGTGCCT 3300

GTGGTCATCC TGATTGCATC TGTTGGCATC GGAGTGGAGT TCACCGTCCA CGTGGCTTTG 3360

GCCTTTCTGA CAGCCATTGG GGACAAGAAC CACAGGGCTA TGCTCGCTCT GGAACACATG 3420

TTTGCTCCCG TTCTGGACGG TGCTGTGTCC ACTCTGCTGG GTGTACTGAT GCTTGCAGGG 3480

TCCGAATTTG ATTTCATTGT CAGATACTTC TTTGCCGTCC TGGCCATTCT CACCGTCTTG 3540

GGGGTTCTCA ATGGACTGGT TCTGCTGCCT GTCCTCTTAT CCTTCTTTGG ACCGTGTCCT 3600

GAGGTGTCTC CAGCCAATGG CCTAAACCGA CTGCCCACTC CTTCGCCTGA GCCGCCTCCA 3660

AGTGTCGTCC GGTTTGCCGT GCCTCCTGGT CACACGAACA ATGGGTCTGA TTCCTCCGAC 3720

TCGGAGTACA GCTCTCAGAC CACGGTGTCT GGCATCAGTG AGGAGCTCAG GCAATACGAA 3780

GCACAGCAGG GTGCCGGAGG CCCTGCCCAC CAAGTGATTG TGGAAGCCAC AGAAAACCCT 3840

GTCTTTGCCC GGTCCACTGT GGTCCATCCG GACTCCAGAC ATCAGCCTCC CTTGACCCCT 3900
CGGCAACAGC CCCACCTGGA CTCTGGCTCC TTGTCCCCTG GACGGCAAGG CCAGCAGCCT 3960
CGAAGGGATC CCCCTAGAGA AGGCTTGCGG CCACCCCCCT ACAGACCGCG CAGAGACGCT 4020
TTTGAAATTT CTACTGAAGG GCATTCTGGC CCTAGCAATA GGGACCGCTC AGGGCCCCGT 4080
GGGGCCCGTT CTCACAACCC TCGGAACCCA ACGTCCACCG CCATGGGCAG CTCTGTGCCC 4140
AGCTACTGCC AGCCCATCAC CACTGTGACG GCTTCTGCTT CGGTGACTGT TGCTGTGCAT 4200
CCCCCGCCTG GACCTGGGCG CAACCCCCGA GGGGGGCCCT GTCCAGGCTA TGAGAGCTAC 4260
CCTGAGACTG ATCACGGGGT ATTTGAGGAT CCTCATGTGC CTTTTCATGT CAGGTGTGAG 4320
AGGAGGGACT CAAAGGTGGA GGTCATAGAG CTACAGGACG TGGAATGTGA GGAGAGGCCG 4380
TGGGGGAGCA GCTCCAACTG AGGGTAATTA AAATCTGAAG CAAAGAGGCC AAAGATTGGA 4440
AAGCCCCGCC CCCACCTCTT TCCAGAACTG CTTGAAGAGA ACTGCTTGGA ATTATGGGAA 4500
GGCAGTTCAT TGTTACTGTA ACTGATTGTA TTATTKKGTG AAATATTTCT ATAAATATTT 4560
AARAGGTGTA CACATGTAAT ATACATGGAA ATGCTGTACA GTCTATTTCC TGGGGCCTCT 4620
CCACTCCTGC CCCAGAGTGG GGAGACCACA GGGGCCCTTT CCCCTGTGTA CATTGGTCTC 4680
TGTGCCACAA CCAAGCTTAA CTTAGTTTTA AAAAAAATCT CCCAGCATAT GTCGCTGCTG 4740
CTTAAATATT GTATAATTTA CTTGTATAAT TCTATGCAAA TATTGCTTAT GTAATAGGAT 4800
TATTTGTAAA GGTTTCTGTT TAAAATATTT TAAATTTGCA TATCACAACC CTGTGGTAGG 4860
ATGAATTGTT ACTGTTAACT TTTGAACACG CTATGCGTGG TAATTGTTTA ACGAGCAGAC 4920
ATGAAGAAAA CAGGTTAATC CCAGTGGCTT CTCTAGGGGT AGTTGTATAT GGTTCGCATG 4980
GGTGGATGTG TGTGTGCATG TGACTTTCCA ATGTACTGTA TTGTGGTTTG TTGTTGTTGT 5040
TGCTGTTGTT GTTCATTTTG GTGTTTTTGG TTGCTTTGTA TGATCTTAGC TCTGGCCTAG 5100
GTGGGCTGGG AAGGTCCAGG TCTTTTTCTG TCGTGATGCT GGTGGAAAGG TGACCCCAAT 5160
CATCTGTCCT ATTCTCTGGG ACTATTC 5187
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1434 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Ala Ser Ala Gly Asn Ala Ala Gly Ala Leu Gly Arg Gln Ala Gly
1 5 10 15

Gly Gly Arg Arg Arg Arg Thr Gly Gly Pro His Arg Ala Ala Pro Asp
20 25 30

Arg Asp Tyr Leu His Arg Pro Ser Tyr Cys Asp Ala Ala Phe Ala Leu
35 40 45

Glu Gln Ile Ser Lys Gly Lys Ala Thr Gly Arg Lys Ala Pro Leu Trp
50 55 60

Leu Arg Ala Lys Phe Gln Arg Leu Leu Phe Lys Leu Gly Cys Tyr Ile
65 70 75 80

Gln Lys Asn Cys Gly Lys Phe Leu Val Val Gly Leu Leu Ile Phe Gly
85 90 95

Ala Phe Ala Val Gly Leu Lys Ala Ala Asn Leu Glu Thr Asn Val Glu
100 105 110

Glu Leu Trp Val Glu Val Gly Gly Arg Val Ser Arg Glu Leu Asn Tyr
115 120 125

Thr Arg Gln Lys Ile Gly Glu Glu Ala Met Phe Asn Pro Gln Leu Met
130 135 140

Ile Gln Thr Pro Lys Glu Glu Gly Ala Asn Val Leu Thr Thr Glu Ala
145 150 155 160

Leu Leu Gln His Leu Asp Ser Ala Leu Gln Ala Ser Arg Val His Val
165 170 175

Tyr Met Tyr Asn Arg Gln Trp Lys Leu Glu His Leu Cys Tyr Lys Ser
180 185 190

Gly Glu Leu Ile Thr Glu Thr Gly Tyr Met Asp Gln Ile Ile Glu Tyr
195 200 205

Leu Tyr Pro Cys Leu Ile Ile Thr Pro Leu Asp Cys Phe Trp Glu Gly
210 215 220

Ala Lys Leu Gln Ser Gly Thr Ala Tyr Leu Leu Gly Lys Pro Pro Leu
225 230 235 240

Arg Trp Thr Asn Phe Asp Pro Leu Glu Phe Leu Glu Glu Leu Lys Lys
245 250 255



Ile Asn Tyr Gln Val Asp Ser Trp Glu Glu Met Leu Asn Lys Ala Glu
260 265 270

Val Gly His Gly Tyr Met Asp Arg Pro Cys Leu Asn Pro Ala Asp Pro
275 280 285

Asp Cys Pro Ala Thr Ala Pro Asn Lys Asn Ser Thr Lys Pro Leu Asp
290 295 300

Val Ala Leu Val Leu Asn Gly Gly Cys Gln Gly Leu Ser Arg Lys Tyr
305 310 315 320

Met His Trp Gln Glu Glu Leu Ile Val Gly Gly Thr Val Lys Asn Ala
325 330 335

Thr Gly Lys Leu Val Ser Ala His Ala Leu Gln Thr Met Phe Gln Leu
340 345 350

Met Thr Pro Lys Gln Met Tyr Glu His Phe Arg Gly Tyr Asp Tyr Val
355 360 365

Ser His Ile Asn Trp Asn Glu Asp Arg Ala Ala Ala Ile Leu Glu Ala
370 375 380

Trp Gln Arg Thr Tyr Val Glu Val Val His Gln Ser Val Ala Pro Asn
385 390 395 400

Ser Thr Gln Lys Val Leu Pro Phe Thr Thr Thr Thr Leu Asp Asp Ile
405 410 415

Leu Lys Ser Phe Ser Asp Val Ser Val Ile Arg Val Ala Ser Gly Tyr
420 425 430

Leu Leu Met Leu Ala Tyr Ala Cys Leu Thr Met Leu Arg Trp Asp Cys
435 440 445

Ser Lys Ser Gln Gly Ala Val Gly Leu Ala Gly Val Leu Leu Val Ala
450 455 460

Leu Ser Val Ala Ala Gly Leu Gly Leu Cys Ser Leu Ile Gly Ile Ser
465 470 475 480

Phe Asn Ala Ala Thr Thr Gln Val Leu Pro Phe Leu Ala Leu Gly Val
485 490 495

Gly Val Asp Asp Val Phe Leu Leu Ala His Ala Phe Ser Glu Thr Gly
500 505 510

Gln Asn Lys Arg Ile Pro Phe Glu Asp Arg Thr Gly Glu Cys Leu Lys
515 520 525

Arg Thr Gly Ala Ser Val Ala Leu Thr Ser Ile Ser Asn Val Thr Ala
530 535 540

Phe Phe Met Ala Ala Leu Ile Pro Ile Pro Ala Leu Arg Ala Phe Ser
545 550 555 560

Leu Gln Ala Ala Val Val Val Val Phe Asn Phe Ala Met Val Leu Leu
565 570 575

Ile Phe Pro Ala Ile Leu Ser Met Asp Leu Tyr Arg Arg Glu Asp Arg
580 585 590

Arg Leu Asp Ile Phe Cys Cys Phe Thr Ser Pro Cys Val Ser Arg Val
595 600 605

Ile Gln Val Glu Pro Gln Ala Tyr Thr Glu Pro His Ser Asn Thr Arg
610 615 620

Tyr Ser Pro Pro Pro Pro Tyr Thr Ser His Ser Phe Ala His Glu Thr
625 630 635 640

His Ile Thr Met Gln Ser Thr Val Gln Leu Arg Thr Glu Tyr Asp Pro
645 650 655

His Thr His Val Tyr Tyr Thr Thr Ala Glu Pro Arg Ser Glu Ile Ser
660 665 670

Val Gln Pro Val Thr Val Thr Gln Asp Asn Leu Ser Cys Gln Ser Pro
675 680 685

Glu Ser Thr Ser Ser Thr Arg Asp Leu Leu Ser Gln Phe Ser Asp Ser
690 695 700



Ser Leu His Cys Leu Glu Pro Pro Cys Thr Lys Trp Thr Leu Ser Ser
705 710 715 720

Phe Ala Glu Lys His Tyr Ala Pro Phe Leu Leu Lys Pro Lys Ala Lys
725 730 735

Val Val Val Ile Leu Leu Phe Leu Gly Leu Leu Gly Val Ser Leu Tyr
740 745 750

Gly Thr Thr Arg Val Arg Asp Gly Leu Asp Leu Thr Asp Ile Val Pro
755 760 765

Arg Glu Thr Arg Glu Tyr Asp Phe Ile Ala Ala Gln Phe Lys Tyr Phe
770 775 780

Ser Phe Tyr Asn Met Tyr Ile Val Thr Gln Lys Ala Asp Tyr Pro Asn
785 790 795 800

Ile Gln His Leu Leu Tyr Asp Leu His Lys Ser Phe Ser Asn Val Lys
805 810 815

Tyr Val Met Leu Glu Glu Asn Lys Gln Leu Pro Gln Met Trp Leu His
820 825 830

Tyr Phe Arg Asp Trp Leu Gln Gly Leu Gln Asp Ala Phe Asp Ser Asp
835 840 845

Trp Glu Thr Gly Arg Ile Met Pro Asn Asn Tyr Lys Asn Gly Ser Asp
850 855 860

Asp Gly Val Leu Ala Tyr Lys Leu Leu Val Gln Thr Gly Ser Arg Asp
865 870 875 880

Lys Pro Ile Asp Ile Ser Gln Leu Thr Lys Gln Arg Leu Val Asp Ala
885 890 895

Asp Gly Ile Ile Asn Pro Ser Ala Phe Tyr Ile Tyr Leu Thr Ala Trp
900 905 910

Val Ser Asn Asp Pro Val Ala Tyr Ala Ala Ser Gln Ala Asn Ile Arg
915 920 925

Pro His Arg Pro Glu Trp Val His Asp Lys Ala Asp Tyr Met Pro Glu
930 935 940

Thr Arg Leu Arg Ile Pro Ala Ala Glu Pro Ile Glu Tyr Ala Gln Phe
945 950 955 960

Pro Phe Tyr Leu Asn Gly Leu Arg Asp Thr Ser Asp Phe Val Glu Ala
965 970 975

Ile Glu Lys Val Arg Val Ile Cys Asn Asn Tyr Thr Ser Leu Gly Leu
980 985 990

Ser Ser Tyr Pro Asn Gly Tyr Pro Phe Leu Phe Trp Glu Gln Tyr Ile
995 1000 1005

Ser Leu Arg His Trp Leu Leu Leu Ser Ile Ser Val Val Leu Ala Cys
1010 1015 1020

Thr Phe Leu Val Cys Ala Val Phe Leu Leu Asn Pro Trp Thr Ala Gly
1025 1030 1035 1040

Ile Ile Val Met Val Leu Ala Leu Met Thr Val Glu Leu Phe Gly Met
1045 1050 1055

Met Gly Leu Ile Gly Ile Lys Leu Ser Ala Val Pro Val Val Ile Leu
1060 1065 1070

Ile Ala Ser Val Gly Ile Gly Val Glu Phe Thr Val His Val Ala Leu
1075 1080 1085

Ala Phe Leu Thr Ala Ile Gly Asp Lys Asn His Arg Ala Met Leu Ala
1090 1095 1100

Leu Glu His Met Phe Ala Pro Val Leu Asp Gly Ala Val Ser Thr Leu
1105 1110 1115 1120

Leu Gly Val Leu Met Leu Ala Gly Ser Glu Phe Asp Phe Ile Val Arg
1125 1130 1135

Tyr Phe Phe Ala Val Leu Ala Ile Leu Thr Val Leu Gly Val Leu Asn
1140 1145 1150

Gly Leu Val Leu Leu Pro Val Leu Leu Ser Phe Phe Gly Pro Cys Pro
1155 1160 1165

Glu Val Ser Pro Ala Asn Gly Leu Asn Arg Leu Pro Thr Pro Ser Pro
1170 1175 1180

Glu Pro Pro Pro Ser Val Val Arg Phe Ala Val Pro Pro Gly His Thr
1185 1190 1195 1200

Asn Asn Gly Ser Asp Ser Ser Asp Ser Glu Tyr Ser Ser Gln Thr Thr
1205 1210 1215

Val Ser Gly Ile Ser Glu Glu Leu Arg Gln Tyr Glu Ala Gln Gln Gly
1220 1225 1230

Ala Gly Gly Pro Ala His Gln Val Ile Val Glu Ala Thr Glu Asn Pro
1235 1240 1245

Val Phe Ala Arg Ser Thr Val Val His Pro Asp Ser Arg His Gln Pro
1250 1255 1260

Pro Leu Thr Pro Arg Gln Gln Pro His Leu Asp Ser Gly Ser Leu Ser
1265 1270 1275 1280

Pro Gly Arg Gln Gly Gln Gln Pro Arg Arg Asp Pro Pro Arg Glu Gly
1285 1290 1295

Leu Arg Pro Pro Pro Tyr Arg Pro Arg Arg Asp Ala Phe Glu Ile Ser
1300 1305 1310

Thr Glu Gly His Ser Gly Pro Ser Asn Arg Asp Arg Ser Gly Pro Arg
1315 1320 1325

Gly Ala Arg Ser His Asn Pro Arg Asn Pro Thr Ser Thr Ala Met Gly
1330 1335 1340

Ser Ser Val Pro Ser Tyr Cys Gln Pro Ile Thr Thr Val Thr Ala Ser
1345 1350 1355 1360

Ala Ser Val Thr Val Ala Val His Pro Pro Pro Gly Pro Gly Arg Asn
1365 1370 1375

Pro Arg Gly Gly Pro Cys Pro Gly Tyr Glu Ser Tyr Pro Glu Thr Asp
1380 1385 1390

His Gly Val Phe Glu Asp Pro His Val Pro Phe His Val Arg Cys Glu
1395 1400 1405

Arg Arg Asp Ser Lys Val Glu Val Ile Glu Leu Gln Asp Val Glu Cys
1410 1415 1420

Glu Glu Arg Pro Trp Gly Ser Ser Ser Asn
1425 1430

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Ile Ile Thr Pro Leu Asp Cys Phe Trp Glu Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Leu Ile Val Gly Gly
1 5 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: 
Pro Phe Phe Trp Glu Gln Tyr
1 5

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GGACGAATTC AARGTNCAYC ARYTNTGG 
(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
GGACGAATTC CYTCCCARAA RCANTC 
(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
GGACGAATTC YTNGANTGYT TYTGGGA 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "primer"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
CATACCAGCC AAGCTTGTCN GGCCARTGCA T 31
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5288 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
GAATTCCGGG GACCGCAAGG AGTGCCGCGG AAGCGCCCGA AGGACAGGCT CGCTCGGCGC 
GCCGGCTCTC GCTCTTCCGC GAACTGGATG TGGGCAGCGG CGGCCGCAGA GACCTCGGGA 
CCCCCGCGCA ATGTGGCAAT GGAAGGCGCA GGGTCTGACT CCCCGGCAGC GGCCGCGGCC 
GCAGCGGCAG CAGCGCCCGC CGTGTGAGCA GCAGCAGCGG CTGGTCTGTC AACCGGAGCC 
CGAGCCCGAG CAGCCTGCGG CCAGCAGCGT CCTCGCAAGC CGAGCGCCCA GGCGCGCCAG 
GAGCCCGCAG CAGCGGCAGC AGCGCGCCGG GCCGCCCGGG AAGCCTCCGT CCCCGCGGCG 
GCGGCGGCGG CGGCGGCGGC AACATGGCCT CGGCTGGTAA CGCCGCCGAG CCCCAGGACC 
GCGGCGGCGG CGGCAGCGGC TGTATCGGTG CCCCGGGACG GCCGGCTGGA GGCGGGAGGC 
GCAGACGGAC GGGGGGGCTG CGCCGTGCTG CCGCGCCGGA CCGGGACTAT CTGCACCGGC 
CCAGCTACTG CGACGCCGCC TTCGCTCTGG AGCAGATTTC CAAGGGGAAG GCTACTGGCC 
GGAAAGCGCC ACTGTGGCTG AGAGCGAAGT TTCAGAGACT CTTATTTAAA CTGGGTTGTT 
ACATTCAAAA AAACTGCGGC AAGTTCTTGG TTGTGGGCCT CCTCATATTT GGGGCCTTCG 
CGGTGGGATT AAAAGCAGCG AACCTCGAGA CCAACGTGGA GGAGCTGTGG GTGGAAGTTG 
GAGGACGAGT AAGTCGTGAA TTAAATTATA CTCGCCAGAA GATTGGAGAA GAGGCTATGT 
TTAATCCTCA ACTCATGATA CAGACCCCTA AAGAAGAAGG TGCTAATGTC CTGACCACAG 
AAGCGCTCCT ACAACACCTG GACTCGGCAC TCCAGGCCAG CCGTGTCCAT GTATACATGT 
ACAACAGGCA GTGGAAATTG GAACATTTGT GTTACAAATC AGGAGAGCTT ATCACAGAAA 1020
CAGGTTACAT GGATCAGATA ATAGAATATC TTTACCCTTG TTTGATTATT ACACCTTTGG 1080
ACTGCTTCTG GGAAGGGGCG AAATTACAGT CTGGGACAGC ATACCTCCTA GGTAAACCTC 1140
CTTTGCGGTG GACAAACTTC GACCCTTTGG AATTCCTGGA AGAGTTAAAG AAAATAAACT
ATCAAGTGGA CAGCTGGGAG GAAATGCTGA ATAAGGCTGA GGTTGGTCAT GGTTACATGG 
ACCGCCCCTG CCTCAATCCG GCCGATCCAG ACTGCCCCGC CACAGCCCCC AACAAAAATT 
CAACCAAACC TCTTGATATG GCCCTTGTTT TGAATGGTGG ATGTCATGGC TTATCCAGAA 
AGTATATGCA CTGGCAGGAG GAGTTGATTG TGGGTGGCAC AGTCAAGAAC AGCACTGGAA 
AACTCGTCAG CGCCCATGCC CTGCAGACCA TGTTCCAGTT AATGACTCCC AAGCAAATGT 
ACGAGCACTT CAAGGGGTAC GAGTATGTCT CACACATCAA CTGGAACGAG GACAAAGCGG 
CAGCCATCCT GGAGGCCTGG CAGAGGACAT ATGTGGAGGT GGTTCATCAG AGTGTCGCAC 
AGAACTCCAC TCAAAAGGTG CTTTCCTTCA CCACCACGAC CCTGGACGAC ATCCTGAAAT 
CCTTCTCTGA CGTCAGTGTC ATCCGCGTGG CCAGCGGCTA CTTACTCATG CTCGCCTATG 
CCTGTCTAAC CATGCTGCGC TGGGACTGCT CCAAGTCCCA GGGTGCCGTG GGGCTGGCTG 
XCCTGCT GGTTGCACTG TCAGTGGCTG CAGGACTGGG CCTGTGCTCA TTGATCGGAA 
TTTCCTTTAA CGCTGCAACA ACTCAGGTTT TGCCATTTCT CGCTCTTGGT GTTGGTGTGG
ATGATGTTTT TCTTCTGGCC CACGCCTTCA GTGAAACAGG ACAGAATAAA AGAATCCCTT
TTGAGGACAG GACCGGGGAG TGCCTGAAGC GCACAGGAGC CAGCGTGGCC CTCACGTCCA
TCAGCAATGT CACAGCCTTC TTCATGGCCG CGTTAATCCC AATTCCCGCT CTGCGGGCGT
TCTCCCTCCA GGCAGCGGTA GTAGTGGTGT TCAATTTTGC CATGGTTCTG CTCATTTTTC
CTGCAATTCT CAGCATGGAT TTATATCGAC GCGAGGACAG GAGACTGGAT ATTTTCTGCT
GTTTTACAAG CCCCTGCGTC AGCAGAGTGA TTCAGGTTGA ACCTCAGGCC TACACCGACA
CACACGACAA TACCCGCTAC AGCCCCCCAC CTCCCTACAG CAGCCACAGC TTTGCCCATG
AAACGCAGAT TACCATGCAG TCCACTGTCC AGCTCCGCAC GGAGTACGAC CCCCACACGC 
ACGTGTACTA CACCACCGCT GAGCCGCGCT CCGAGATCTC TGTGCAGCCC GTCACCGTGA 
CACAGGACAC CCTCAGCTGC CAGAGCCCAG AGAGCACCAG CTCCACAAGG GACCTGCTCT 
CCCAGTTCTC CGACTCCAGC CTCCACTGCC TCGAGCCCCC CTGTACGAAG TGGACACTCT 
CATCTTTTGC TGAGAAGCAC TATGCTCCTT TCCTCTTGAA ACCAAAAGCC AAGGTAGTGG 
TGATCTTCCT TTTTCTGGGC TTGCTGGGGG TCAGCCTTTA TGGCACCACC CGAGTGAGAG 
ACGGGCTGGA CCTTACGGAC ATTGTACCTC GGGAAACCAG AGAATATGAC TTTATTGCTG
CACAATTCAA ATACTTTTCT TTCTACAACA TGTATATAGT CACCCAGAAA GCAGACTACC
CGAATATCCA GCACTTACTT TACGACCTAC ACAGGAGTTT CAGTAACGTG AAGTATGTCA 
TGTTGGAAGA AAACAAACAG CTTCCCAAAA TGTGGCTGCA CTACTTCAGA GACTGGCTTC 

ao-::;acttca GGATGCATTT GACAGTGACT GGGAAACCGG GAAAATCATG CCAAACAATT

ACAAGAATGG ATCAGACGAT GGAGTCCTTG CCTACAAACT CCTGGTGCAA ACCGGCAGCC
GCGATAAGCC CATCGACATC AGCCAGTTGA CTAAACAGCG TCTGGTGGAT GCAGATGGCA
TCATTAATCC CAGCGCTTTC TACATCTACC TGACGGCTTG GGTCAGCAAC GACCCCGTCG 
CGTATGCTGC CTCCCAGGCC AACATCCGGC CACACCGACC AGAATGGGTC CACGACAAAG 
CCGACTACAT GCCTGAAACA AGGCTGAGAA TCCCGGCAGC AGAGCCCATC GAGTATGCCC 
AGTTCCCTTT CTACCTCAAC GGGTTGCGGG ACACCTCAGA CTTTGTGGAG GCAATTGAAA 
AAGTAAGGAC CATCTGCAGC AACTATACGA GCCTGGGGCT GTCCAGTTAC CCCAACGGCT 
ACCCCTTCCT CTTCTGGGAG CAGTACATCG GCCTCCGCCA CTGGCTGCTG CTGTTCATCA 
GCGTGGTGTT GGCCTGCACA TTCCTCGTGT GCGCTGTCTT CCTTCTGAAC CCCTGGACGG 
CCGGGATCAT TGTGATGGTC CTGGCGCTGA TGACGGTCGA GCTGTTCGGC ATGATGGGCC 
TCATCGGAAT CAAGCTCAGT GCCGTGCCCG TGGTCATCCT GATCGCTTCT GTTGGCATAG 
GAGTGGAGTT CACCGTTCAC GTTGCTTTGG CCTTTCTGAC GGCCATCGGC GACAAGAACC 
GCAGGGCTGT GCTTGCCCTG GAGCACATGT TTGCACCCGT CCTGGATGGC GCCGTGTCCA 
CTCTGCTGGG AGTGCTGATG CTGGCGGGAT CTGAGTTCGA CTTCATTGTC AGGTATTTCT 
TTGCTGTGCT GGCGATCCTC ACCATCCTCG GCGTTCTCAA TGGGCTGGTT TTGCTTCCCG 
TGCTTTTGTC TTTCTTTGGA CCATATCCTG AGGTGTCTCC AGCCAACGGC TTGAACCGCC 
TGCCCACACC CTCCCCTGAG CCACCCCCCA GCGTGGTCCG CTTCGCCATG CCGCCCGGCC 
ACACGCACAG CGGGTCTGAT TCCTCCGACT CGGAGTATAG TTCCCAGACG ACAGTGTCAG 
GCCTCAGCGA GGAGCTTCGG CACTACGAGG CCCAGCAGGG CGCGGGAGGC CCTGCCCACC 
AAGTGATCGT GGAAGCCACA GAAAACCCCG TCTTCGCCCA CTCCACTGTG GTCCATCCCG 
AATCCAGGCA TCACCCACCC TCGAACCCGA GACAGCAGCC CCACCTGGAC TCAGGGTCCC 
TGCCTCCCGG ACGGCAAGGC CAGCAGCCCC GCAGGGACCC CCCCAGAGAA GGCTTGTGGC 
CACCCCTCTA CAGACCGCGC AGAGACGCTT TTGAAATTTC TACTGAAGGG CATTCTGGCC 
CTAGCAATAG GGCCCGCTGG GGCCCTCGCG GGGCCCGTTC TCACAACCCT CGGAACCCAG 
CGTCCACTGC CATGGGCAGC TCCGTGCCCG GCTACTGCCA GCCCATCACC ACTGTGACGG 
CTTCTGCCTC CGTGACTGTC GCCGTGCACC CGCCGCCTGT CCCTGGGCCT GGGCGGAACC 
CCCGAGGGGG ACTCTGCCCA GGCTACCCTG AGACTGACCA CGGCCTGTTT GAGGACCCCC 
ACGTGCCTTT CCACGTCCGG TGTGAGAGGA GGGATTCGAA GGTGGAAGTC ATTGAGCTGC
AGGACGTGGA ATGCGAGGAG AGGCCCCGGG GAAGCAGCTC CAACTGAGGG TGATTAAAAT 
CTGAAGCAAA GAGGCCAAAG ATTGGAAACC CCCCACCCCC ACCTCTTTCC AGAACTGCTT 
GAAGAGAACT GGTTGGAGTT ATGGAAAAGA TGCCCTGTGC CAGGACAGCA GTTCATTGTT 
ACTGTAACCG ATTGTATTAT TTTGTTAAAT ATTTCTATAA ATATTTAAGA GATGTACACA 4920
TGTGTAATAT AGGAAGGAAG GATGTAAAGT GGTATGATCT GGGGCTTCTC CACTCCTGCC 4980

CCAGAGTGTG GAGGCCACAG TGGGGCCTCT CCGTATTTGT GCATTGGGCT CCGTGCCACA 5040

ACCAAGCTTC ATTAGTCTTA AATTTCAGCA TATGTTGCTG CTGCTTAAAT ATTGTATAAT 5100

TTACTTGTAT AATTCTATGC AAATATTGCT TATGTAATAG GATTATTTTG TAAAGGTTTC 5160

TGTTTAAAAT ATTTTAAATT TGCATATCAC AACCCTGTGG TAGTATGAAA TGTTACTGTT 5220

AACTTTCAAA CACGCTATGC GTGATAATTT TTTTGTTTAA TGAGCAGATA TGAAGAAAGC 5280

CCGGAATT 5288

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1447 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Met Ala Ser Ala Gly Asn Ala Ala Glu Pro Gln Asp Arg Gly Gly Gly
1 5 10 15

Gly Ser Gly Cys Ile Gly Ala Pro Gly Arg Pro Ala Gly Gly Gly Arg
20 25 30

Arg Arg Arg Thr Gly Gly Leu Arg Arg Ala Ala Ala Pro Asp Arg Asp
35 40 45

Tyr Leu His Arg Pro Ser Tyr Cys Asp Ala Ala Phe Ala Leu Glu Gln
50 55 60

Ile Ser Lys Gly Lys Ala Thr Gly Arg Lys Ala Pro Leu Trp Leu Arg
65 70 75 80

Ala Lys Phe Gln Arg Leu Leu Phe Lys Leu Gly Cys Tyr Ile Gln Lys
85 90 95

Asn Cys Gly Lys Phe Leu Val Val Gly Leu Leu Ile Phe Gly Ala Phe
100 105 110

Ala Val Gly Leu Lys Ala Ala Asn Leu Glu Thr Asn Val Glu Glu Leu
115 120 125

Trp Val Glu Val Gly Gly Arg Val Ser Arg Glu Leu Asn Tyr Thr Arg
130 135 140

Gln Lys Ile Gly Glu Glu Ala Met Phe Asn Pro Gln Leu Met Ile Gln
145 150 155 160

Thr Pro Lys Glu Glu Gly Ala Asn Val Leu Thr Thr Glu Ala Leu Leu
165 170 175

Gln His Leu Asp Ser Ala Leu Gln Ala Ser Arg Val His Val Tyr Met
180 185 190

Tyr Asn Arg Gln Trp Lys Leu Glu His Leu Cys Tyr Lys Ser Gly Glu
195 200 205

Leu Ile Thr Glu Thr Gly Tyr Met Asp Gln Ile Ile Glu Tyr Leu Tyr
210 215 220

Pro Cys Leu Ile Ile Thr Pro Leu Asp Cys Phe Trp Glu Gly Ala Lys
225 230 235 240

Leu Gln Ser Gly Thr Ala Tyr Leu Leu Gly Lys Pro Pro Leu Arg Trp
245 250 255

Thr Asn Phe Asp Pro Leu Glu Phe Leu Glu Glu Leu Lys Lys Ile Asn
260 265 270

Tyr Gln Val Asp Ser Trp Glu Glu Met Leu Asn Lys Ala Glu Val Gly
275 280 285

His Gly Tyr Met Asp Arg Pro Cys Leu Asn Pro Ala Asp Pro Asp Cys
290 295 300

Pro Ala Thr Ala Pro Asn Lys Asn Ser Thr Lys Pro Leu Asp Met Ala
305 310 315 320

Leu Val Leu Asn Gly Gly Cys His Gly Leu Ser Arg Lys Tyr Met His
325 330 335

Trp Gln Glu Glu Leu Ile Val Gly Gly Thr Val Lys Asn Ser Thr Gly
340 345 350

Lys Leu Val Ser Ala His Ala Leu Gln Thr Met Phe Gln Leu Met Thr
355 360 365

Pro Lys Gln Met Tyr Glu His Phe Lys Gly Tyr Glu Tyr Val Ser His
370 375 380

Ile Asn Trp Asn Glu Asp Lys Ala Ala Ala Ile Leu Glu Ala Trp Gln
385 390 395 400

Arg Thr Tyr Val Glu Val Val His Gln Ser Val Ala Gln Asn Ser Thr
405 410 415

Gln Lys Val Leu Ser Phe Thr Thr Thr Thr Leu Asp Asp Ile Leu Lys
420 425 430

Ser Phe Ser Asp Val Ser Val Ile Arg Val Ala Ser Gly Tyr Leu Leu
435 440 445

Met Leu Ala Tyr Ala Cys Leu Thr Met Leu Arg Trp Asp Cys Ser Lys
450 455 460

Ser Gln Gly Ala Val Gly Leu Ala Gly Val Leu Leu Val Ala Leu Ser
465 470 475 480

Val Ala Ala Gly Leu Gly Leu Cys Ser Leu Ile Gly Ile Ser Phe Asn
485 490 495

Ala Ala Thr Thr Gln Val Leu Pro Phe Leu Ala Leu Gly Val Gly Val
500 505 510

Asp Asp Val Phe Leu Leu Ala His Ala Phe Ser Glu Thr Gly Gln Asn
515 520 525

Lys Arg Ile Pro Phe Glu Asp Arg Thr Gly Glu Cys Leu Lys Arg Thr
530 535 540

Gly Ala Ser Val Ala Leu Thr Ser Ile Ser Asn Val Thr Ala Phe Phe
545 550 555 560

Met Ala Ala Leu Ile Pro Ile Pro Ala Leu Arg Ala Phe Ser Leu Gln
565 570 575

Ala Ala Val Val Val Val Phe Asn Phe Ala Met Val Leu Leu Ile Phe
580 585 590

Pro Ala Ile Leu Ser Met Asp Leu Tyr Arg Arg Glu Asp Arg Arg Leu
595 600 605

Asp Ile Phe Cys Cys Phe Thr Ser Pro Cys Val Ser Arg Val Ile Gln
610 615 620

Val Glu Pro Gln Ala Tyr Thr Asp Thr His Asp Asn Thr Arg Tyr Ser
625 630 635 640

Pro Pro Pro Pro Tyr Ser Ser His Ser Phe Ala His Glu Thr Gln Ile
645 650 655

Thr Met Gln Ser Thr Val Gln Leu Arg Thr Glu Tyr Asp Pro His Thr
660 665 670

His Val Tyr Tyr Thr Thr Ala Glu Pro Arg Ser Glu Ile Ser Val Gln
675 680 685

Pro Val Thr Val Thr Gln Asp Thr Leu Ser Cys Gln Ser Pro Glu Ser
690 695 700

Thr Ser Ser Thr Arg Asp Leu Leu Ser Gln Phe Ser Asp Ser Ser Leu
705 710 715 720

His Cys Leu Glu Pro Pro Cys Thr Lys Trp Thr Leu Ser Ser Phe Ala
725 730 735

Glu Lys His Tyr Ala Pro Phe Leu Leu Lys Pro Lys Ala Lys Val Val
740 745 750

Val Ile Phe Leu Phe Leu Gly Leu Leu Gly Val Ser Leu Tyr Gly Thr
755 760 765

Thr Arg Val Arg Asp Gly Leu Asp Leu Thr Asp Ile Val Pro Arg Glu
770 775 780

Thr Arg Glu Tyr Asp Phe Ile Ala Ala Gln Phe Lys Tyr Phe Ser Phe
785 790 795 800

Tyr Asn Met Tyr Ile Val Thr Gln Lys Ala Asp Tyr Pro Asn Ile Gln
805 810 815

His Leu Leu Tyr Asp Leu His Arg Ser Phe Ser Asn Val Lys Tyr Val
820 825 830

Met Leu Glu Glu Asn Lys Gln Leu Pro Lys Met Trp Leu His Tyr Phe
835 840 845

Arg Asp Trp Leu Gln Gly Leu Gln Asp Ala Phe Asp Ser Asp Trp Glu
850 855 860

Thr Gly Lys Ile Met Pro Asn Asn Tyr Lys Asn Gly Ser Asp Asp Gly
865 870 875 880

Val Leu Ala Tyr Lys Leu Leu Val Gln Thr Gly Ser Arg Asp Lys Pro
885 890 895

Ile Asp Ile Ser Gln Leu Thr Lys Gln Arg Leu Val Asp Ala Asp Gly
900 905 910

Ile Ile Asn Pro Ser Ala Phe Tyr Ile Tyr Leu Thr Ala Trp Val Ser
915 920 925

Asn Asp Pro Val Ala Tyr Ala Ala Ser Gln Ala Asn Ile Arg Pro His
930 935 940

Arg Pro Glu Trp Val His Asp Lys Ala Asp Tyr Met Pro Glu Thr Arg
945 950 955 960

Leu Arg Ile Pro Ala Ala Glu Pro Ile Glu Tyr Ala Gln Phe Pro Phe
965 970 975

Tyr Leu Asn Gly Leu Arg Asp Thr Ser Asp Phe Val Glu Ala Ile Glu
980 985 990

Lys Val Arg Thr Ile Cys Ser Asn Tyr Thr Ser Leu Gly Leu Ser Ser
995 1000 1005

Tyr Pro Asn Gly Tyr Pro Phe Leu Phe Trp Glu Gln Tyr Ile Gly Leu
1010 1015 1020

Arg His Trp Leu Leu Leu Phe Ile Ser Val Val Leu Ala Cys Thr Phe
1025 1030 1035 1040

Leu Val Cys Ala Val Phe Leu Leu Asn Pro Trp Thr Ala Gly Ile Ile
1045 1050 1055

Val Met Val Leu Ala Leu Met Thr Val Glu Leu Phe Gly Met Met Gly
1060 1065 1070

Leu Ile Gly Ile Lys Leu Ser Ala Val Pro Val Val Ile Leu Ile Ala
1075 1080 1085

Ser Val Gly Ile Gly Val Glu Phe Thr Val His Val Ala Leu Ala Phe
1090 1095 1100

Leu Thr Ala Ile Gly Asp Lys Asn Arg Arg Ala Val Leu Ala Leu Glu
1105 1110 1115 1120

His Met Phe Ala Pro Val Leu Asp Gly Ala Val Ser Thr Leu Leu Gly
1125 1130 1135

Val Leu Met Leu Ala Gly Ser Glu Phe Asp Phe Ile Val Arg Tyr Phe
1140 1145 1150

Phe Ala Val Leu Ala Ile Leu Thr Ile Leu Gly Val Leu Asn Gly Leu
1155 1160 1165

Val Leu Leu Pro Val Leu Leu Ser Phe Phe Gly Pro Tyr Pro Glu Val
1170 1175 1180

Ser Pro Ala Asn Gly Leu Asn Arg Leu Pro Thr Pro Ser Pro Glu Pro
1185 1190 1195 1200

Pro Pro Ser Val Val Arg Phe Ala Met Pro Pro Gly His Thr His Ser
1205 1210 1215

Gly Ser Asp Ser Ser Asp Ser Glu Tyr Ser Ser Gln Thr Thr Val Ser
1220 1225 1230

Gly Leu Ser Glu Glu Leu Arg His Tyr Glu Ala Gln Gln Gly Ala Gly
1235 1240 1245

Gly Pro Ala His Gln Val Ile Val Glu Ala Thr Glu Asn Pro Val Phe
1250 1255 1260

Ala His Ser Thr Val Val His Pro Glu Ser Arg His His Pro Pro Ser
1265 1270 1275 1280

Asn Pro Arg Gln Gln Pro His Leu Asp Ser Gly Ser Leu Pro Pro Gly
1285 1290 1295

Arg Gln Gly Gln Gln Pro Arg Arg Asp Pro Pro Arg Glu Gly Leu Trp
1300 1305 1310

Pro Pro Leu Tyr Arg Pro Arg Arg Asp Ala Phe Glu Ile Ser Thr Glu
1315 1320 1325

Gly His Ser Gly Pro Ser Asn Arg Ala Arg Trp Gly Pro Arg Gly Ala
1330 1335 1340

Arg Ser His Asn Pro Arg Asn Pro Ala Ser Thr Ala Met Gly Ser Ser
1345 1350 1355 1360

Val Pro Gly Tyr Cys Gln Pro Ile Thr Thr Val Thr Ala Ser Ala Ser
1365 1370 1375

Val Thr Val Ala Val His Pro Pro Pro Val Pro Gly Pro Gly Arg Asn
1380 1385 1390

Pro Arg Gly Gly Leu Cys Pro Gly Tyr Pro Glu Thr Asp His Gly Leu
1395 1400 1405

Phe Glu Asp Pro His Val Pro Phe His Val Arg Cys Glu Arg Arg Asp
1410 1415 1420

Ser Lys Val Glu Val Ile Glu Leu Gln Asp Val Glu Cys Glu Glu Arg
1425 1430 1435 1440

Pro Arg Gly Ser Ser Ser Asn
1445



